[Claim(s)]

[Claim 1] A speech processing unit characterized by comprising: a read-out means which reads voice data
from a record medium on which the voice data was recorded; a level adjustment means which adjusts the level
of the voice data read by the read-out means by a predetermined method; a speech recognition means which
performs speech recognition on the voice data whose level has been adjusted by the level adjustment means;
and an output means which outputs the recognition result of the speech recognition means.

[Claim 2] A speech processing unit characterized by comprising: a read-out means which reads voice data
from a record medium on which the voice data was recorded; a voice judging means which judges the voice
data read by the read-out means to be a voiced portion or a silent portion; a level adjustment means which
adjusts the level of the voice data read by the read-out means by a predetermined method, based on the
absolute value of the voice data judged by the voice judging means to be a voiced portion; a speech
recognition means which receives the voice data adjusted by the level adjustment means and performs speech
recognition; and an output means which outputs the recognition result of the speech recognition means.

[Claim 3] The speech processing unit according to claim 2, characterized by further comprising a minimum
calculation means which calculates the minimum of the energy of the voice data in a predetermined section,
wherein the criterion of the voice judging means is set based on the minimum calculated by the minimum
calculation means.

[Claim 4] A record medium on which a voice recognition program for performing speech recognition by
computer is recorded, the voice recognition program being characterized by causing the computer to read
voice data from a record medium on which the voice data was recorded, to adjust the level of the read voice
data, to perform speech recognition on the voice data after the level adjustment, and to output the speech
recognition result.

[Claim 5] A record medium on which a processing program for performing, by computer, processing which
passes voice data to a voice recognition program is recorded, the processing program being characterized by
causing the computer to read voice data from a record medium on which the voice data was recorded, to adjust
the level of the read voice data, and to pass the voice data after the level adjustment to the voice
recognition program.



[Detailed Description of the Invention] 
[0001] 

[Field of the Invention] This invention relates to a speech processing unit, a record medium on which a
voice recognition program is recorded, and a record medium on which a processing program is recorded. In
more detail, it relates to a speech processing unit which processes voice data, a record medium on which a
voice recognition program for performing speech recognition by computer is recorded, and a record medium on
which a processing program for performing, by computer, processing which passes voice data to a voice
recognition program is recorded.
[0002] 

[Description of the Prior Art] Realizing what is called a voice word processor, or a dictation system
which, when voice data is inputted by oral statement, automatically draws up a document based on the voice
data and displays it on a screen etc., has long been one target of speech recognition system development,
and research and development are now being actively pursued.

[0003] With the advance of such speech recognition technology in recent years, equipment in which a
microphone is connected to a personal computer, and the voice inputted through the microphone is converted
into a document on the personal computer and displayed on a screen, has been developed and is generally
marketed.
[0004] On the other hand, in drawing up a document, it has conventionally been common, as one effective
way of using sound recording equipment such as a tape recorder, to first record the content of the document
by oral statement on the sound recording equipment, and to have a secretary, a typist, etc. later play back
the oral statement while documenting it with document preparation equipment such as a typewriter or a word
processor.
[0005] In the form of dictation which uses such sound recording equipment, realization of technology for
automatically converting the recorded content into a document has long been strongly desired.
[0006] Moreover, with the development of computer technology, digital signal processing technology, etc.
in recent years, what is called a digital recorder, which digitizes the recorded content and records it on a
writable and erasable record medium such as a flash memory, has come to be developed. It is further possible
to transmit the digitized recorded content to a personal computer and to play it back on the personal
computer.

[0007] The present applicant is developing a voice data processing control unit which makes it possible to
handle, by easy operation on a personal computer, the recorded data transmitted from such a digital
recorder, and has proposed it in Tokuganhei 9-149728.

[0008] Furthermore, the present applicant has developed a dictation system in which digitally recorded
voice data is passed from the above-mentioned voice data processing control unit to a voice recognition unit
for speech recognition and is displayed on a screen as a document, and has proposed it in Tokuganhei
9-149729.
[0009] According to such a dictation system, it is not necessary to sit in front of a computer and perform
direct voice input. It becomes possible to record once on a digital recorder, transmit the recorded data to
a computer later, and have a document drawn up.

[0010] By the way, in order to raise the performance of speech recognition, the voice input level needs to
be proper. At present, it is difficult to guarantee a high recognition rate over a wide range from low to
high levels, and equipment must after all be set so that the highest recognition rate is obtained at an
average sound level.

[0011] Then, in a voice recognition unit of the form which performs voice input, for example from a
microphone as mentioned above, a level meter which shows the height of the sound level is displayed on a
screen etc., so that the speaker can keep the sound level in a proper state.
[0012] As an example of such technology, JP,H5-231922,A discloses a sound-pressure-level drop for voice
recognition units which has a 1st ****** for sound signal reception, a 2nd ****** which receives the noise
signal near the 1st ******, a sound-pressure-level ratio calculation means which obtains the ratio of the
sound pressure level inputted into the 1st ****** to the sound pressure level inputted into the 2nd ******,
and a display means which displays the ratio of the sound pressure levels obtained by the
sound-pressure-level ratio calculation means.
[0013] 

[Problem to be solved by the invention] However, in the dictation system mentioned above, in which
digitally recorded voice data is passed from a processing control unit to a voice recognition unit for speech
recognition and the recognized result is displayed on a screen as a document, already recorded voice data
serves as the input to the voice recognition unit. The system therefore could not determine whether the
already recorded voice data was proper as an input level for the voice recognition unit, nor did it have a
function to adjust the sound level automatically. For this reason, the recognition rate of speech
recognition could change greatly with the level of the recorded voice data.

[0014] This invention was made in view of the above-mentioned situation, and aims at offering a speech
processing unit, a record medium on which a voice recognition program is recorded, and a record medium on
which a processing program is recorded, which make it possible to perform stable speech recognition that
does not depend on the level of the recorded voice data.
[0015] 

[Means for solving problem] In order to attain the above-mentioned object, the speech processing unit
according to the 1st invention has a read-out means which reads voice data from a record medium on which the
voice data was recorded, a level adjustment means which adjusts the level of the voice data read by the
read-out means by a predetermined method, a speech recognition means which performs speech recognition on
the voice data whose level has been adjusted by the level adjustment means, and an output means which
outputs the recognition result of the speech recognition means.

[0016] Moreover, the speech processing unit according to the 2nd invention has a read-out means which
reads voice data from a record medium on which the voice data was recorded, a voice judging means which
judges the voice data read by the read-out means to be a voiced portion or a silent portion, a level
adjustment means which adjusts the level of the voice data read by the read-out means by a predetermined
method, based on the absolute value of the voice data judged by the voice judging means to be a voiced
portion, a speech recognition means which receives the voice data adjusted by the level adjustment means and
performs speech recognition, and an output means which outputs the recognition result of the speech
recognition means.

[0017] Furthermore, the speech processing unit according to the 3rd invention is the speech processing
unit according to the 2nd invention, further provided with a minimum calculation means which calculates the
minimum of the energy of the voice data in a predetermined section, wherein the criterion of the voice
judging means is set based on the minimum calculated by the minimum calculation means.
[0018] The record medium on which the voice recognition program according to the 4th invention is recorded
is a record medium on which a voice recognition program for performing speech recognition by computer is
recorded. The voice recognition program causes the computer to read voice data from a record medium on which
the voice data was recorded, to adjust the level of the read voice data, to perform speech recognition on
the voice data after the level adjustment, and to output the speech recognition result.
[0019] In addition, the record medium on which the processing program according to the 5th invention is
recorded is a record medium on which a processing program for performing, by computer, processing which
passes voice data to a voice recognition program is recorded. The processing program causes the computer to
read voice data from a record medium on which the voice data was recorded, to adjust the level of the read
voice data, and to pass the voice data after the level adjustment to the voice recognition program.

[0020] Therefore, in the speech processing unit according to the 1st invention, the read-out means reads
voice data from the record medium on which the voice data was recorded, the level adjustment means adjusts
the level of the voice data read by the read-out means by a predetermined method, the speech recognition
means performs speech recognition on the voice data whose level has been adjusted by the level adjustment
means, and the output means outputs the recognition result of the speech recognition means.

[0021] Moreover, in the speech processing unit according to the 2nd invention, the read-out means reads
voice data from the record medium on which the voice data was recorded, and the voice judging means judges
the voice data read by the read-out means to be a voiced portion or a silent portion. The level adjustment
means adjusts the level of the voice data read by the read-out means by a predetermined method, based on the
absolute value of the voice data judged by the voice judging means to be a voiced portion. The speech
recognition means receives the voice data adjusted by the level adjustment means and performs speech
recognition, and the output means outputs the recognition result of the speech recognition means.

[0022] Furthermore, in the speech processing unit according to the 3rd invention, the minimum calculation
means calculates the minimum of the energy of the voice data in a predetermined section, and the criterion
of the voice judging means is set based on the minimum calculated by the minimum calculation means.

[0023] The voice recognition program recorded on the record medium according to the 4th invention is for
performing speech recognition by computer. It causes the computer to read voice data from a record medium on
which the voice data was recorded, to adjust the level of the read voice data, to perform speech recognition
on the voice data after the level adjustment, and to output the speech recognition result.

[0024] In addition, the processing program recorded on the record medium according to the 5th invention is
for performing, by computer, processing which passes voice data to a voice recognition program. It causes
the computer to read voice data from a record medium on which the voice data was recorded, to adjust the
level of the read voice data, and to pass the voice data after the level adjustment to the voice recognition
program.
[0025] 

[Mode for carrying out the invention] An embodiment of this invention is explained hereafter with
reference to the drawings. Drawing 1 through drawing 6 show one embodiment of this invention, and drawing 1
is a conceptual drawing of the entire configuration of the dictation system to which this invention is
applied.

[0026] As shown in drawing 1, this dictation system comprises: a digital recorder 1 which converts voice
into an electrical signal and turns it into voice data; a Miniature Card 2, serving as a record medium,
which is removably attached to the digital recorder 1 and records the voice data; a PC card adapter 3 for
inserting the Miniature Card 2 into a PC Card slot 40 (see drawing 2), described later, to make connection
possible; and a personal computer 4, serving as a speech processing unit, which is equipped with a display 5
serving as an output means, a keyboard 6, a mouse 7, etc., and which processes the voice data obtained from
the Miniature Card 2 through the PC Card slot 40 by means of a control program 8 or a voice recognition
program 9.

[0027] Next, drawing 2 is a block diagram showing the electric composition of the above-mentioned personal
computer 4.

[0028] This personal computer 4 comprises: a CPU 31 which performs sound reproduction, information
display, etc. according to the control program 8, performs document preparation etc. according to the voice
recognition program 9, performs various processings according to various other programs, and serves as the
read-out means, the level adjustment means, the speech recognition means, the voice judging means, the
minimum calculation means, the gain value calculation means, the multiplication means, and the average
calculation means; a main memory 32, serving as a record medium, used as the working area of the CPU 31; an
internal recording medium 33, serving as a record medium, which consists of, for example, a hard disk, a
floppy disk, etc., and on which the control program 8 and the voice recognition program 9 are recorded; an
external port 34 for connecting various external instruments; an interface (hereafter abbreviated to IF) 35
which connects the display 5; an IF 36 which connects the keyboard 6 and the mouse 7; a loudspeaker 38 which
utters voice based on voice data; an IF 37 which connects the loudspeaker 38; the PC Card slot 40 into which
the Miniature Card 2 attached to the PC card adapter 3 is inserted; and an IF 39 for connecting the PC Card
slot 40. The CPU 31, the main memory 32, the internal recording medium 33, the external port 34, and the IFs
35, 36, 37, and 39 are mutually connected through a bus.

[0029] In addition, the voice data may be read directly from the Miniature Card 2 through the PC Card slot
40, or it may first be recorded on the internal recording medium 33 and then read from the internal
recording medium 33, or it may be read directly from the digital recorder 1 through means of communications
etc.

[0030] Drawing 3 is a drawing which shows the overall flow in the dictation system when voice data is read
from a voice memory and speech recognition is performed, and drawing 4 is a flow chart which shows the
speech recognition processing in the dictation system.

[0031] When processing starts as shown in drawing 4, the voice data recorded in file units is read from
the voice memory 11, such as the Miniature Card 2 or the internal recording medium 33, and decoding
processing 12 is performed (Step S1).

[0032] The result of this decoding processing 12 is sent to the voiced / silent decision processing 13 and
to the sample average-absolute-value computation 14.

[0033] Next, in the voiced / silent decision processing 13, a voiced / silent judging threshold is
computed (Step S2), and the voiced / silent judgment is performed based on the computed threshold (Step S3).
These processings are explained in detail later with reference to drawing 5. The result of the voiced /
silent decision processing 13 is sent to the sample average-absolute-value computation 14.

[0034] Then, the sample average-absolute-value computation 14 and the gain computation 15 perform
processing which calculates the gain (Step S4). This processing is explained in detail later with reference
to drawing 6. Based on the gain value calculated by the gain computation 15, the output of the decoding
processing 12 is amplified in the gain multiplication processing 16 (Step S5).

[0035] The voice data adjusted to a suitable level by this gain multiplication processing 16 is sent to the
speech recognition processing 17, and speech recognition is performed (Step S6).

[0036] Then, transliteration which converts the result of this speech recognition into character codes is
performed (Step S7), the converted character codes are outputted, and display 18 is performed on the screen
of the display 5 etc. (Step S8).

[0037] In addition, although the speech recognition result is displayed on the display 5 as characters
here, this invention is not limited to this.
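The flow of Steps S2 through S8 can be sketched roughly as follows. This is only an illustrative sketch, not the implementation of the embodiment: the function names `run_dictation` and `stub_recognizer` are hypothetical, the samples are assumed to be already decoded (Step S1), the frame energy is assumed to be the sum of squared samples (the source shows the formula only in a drawing), and the recognizer is a stand-in callable rather than a real speech recognition engine.

```python
from typing import Callable, List, Sequence


def run_dictation(
    samples: Sequence[float],
    frame_len: int,
    lev: float,
    recognize: Callable[[List[float]], str],
    alpha: float = 1.8,
) -> str:
    """Hypothetical end-to-end flow for Steps S2-S8 on already decoded samples."""
    n_frames = len(samples) // frame_len
    if n_frames == 0:
        raise ValueError("need at least one full frame")
    frames = [list(samples[f * frame_len:(f + 1) * frame_len]) for f in range(n_frames)]

    # Assumption: frame energy is the sum of squared samples in the frame.
    energies = [sum(s * s for s in frame) for frame in frames]

    # S2: threshold from the minimum frame energy, scaled by alpha.
    trs = alpha * min(energies)

    # S3: keep only the frames whose energy exceeds the threshold.
    voiced = [frame for frame, e in zip(frames, energies) if e > trs]
    if not voiced:
        raise ValueError("no frame exceeded the threshold")

    # S4: average of the per-frame absolute-value sums, then gain = LEV / average.
    average = sum(sum(abs(s) for s in frame) for frame in voiced) / len(voiced)
    gain = lev / average

    # S5: multiply every decoded sample by the gain.
    adjusted = [gain * s for s in samples]

    # S6-S8: hand the adjusted data to the recognizer and return its text.
    return recognize(adjusted)


def stub_recognizer(adjusted: List[float]) -> str:
    """Stand-in for a real recognizer: just reports the peak level it received."""
    return "peak=%.1f" % max(abs(v) for v in adjusted)


text = run_dictation(
    [0.5, 0.5, 3.0, -3.0, 1.0, -1.0], frame_len=2, lev=8.0, recognize=stub_recognizer
)
```

Passing the recognizer in as a callable keeps the level adjustment separate from the recognition step, in the same way the processing program of the 5th invention hands adjusted data to a separate voice recognition program.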

[0038] Drawing 5 is a flow chart which shows the content of the processing concerning the voiced / silent
judgment in Step S2 and Step S3 of above-mentioned drawing 4.

[0039] When this processing starts, the variable f which shows the counted value of the frame number is
first initialized to 0 (Step S11).

[0040] Next, Variable f is incremented (Step S12), and frame energy e(f) is calculated with the formula
shown in the drawing (Step S13). In the formula, s(i) denotes the input signal at the (i-1)th sample in one
frame, and N denotes the number of samples which constitute one frame.

[0041] Next, it is judged whether the value of Variable f is 1, that is, whether this is the first frame
(Step S14), and when f is 1, the value of the variable min which shows the minimum frame energy is set to
e(1) (Step S16).

[0042] Moreover, when f is not 1 in the above-mentioned Step S14, it is judged whether frame energy e(f)
is smaller than Variable min (Step S15). If it is smaller, frame energy e(f) is set to Variable min (Step
S17), and if it is not smaller, processing goes to the following Step S18 without doing anything.
[0043] Then it is judged whether the end of the file has been reached (Step S18), and if it has not yet
been reached, processing returns to the above-mentioned Step S12 and the processing mentioned above is
repeated.

[0044] Moreover, when it is judged in this Step S18 that the end of the file has been reached, the value
obtained by multiplying the above-mentioned variable min by the predetermined value alpha (for example, 1.8)
is set as the threshold trs (Step S19), and this processing ends.
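The threshold computation of drawing 5 can be sketched as follows. This is a minimal sketch under stated assumptions: the helper names `frame_energies` and `silence_threshold` are hypothetical, the frame energy is assumed to be the sum of squared samples because the source gives the formula only in a drawing, and a trailing partial frame is simply dropped.

```python
from typing import List, Sequence


def frame_energies(samples: Sequence[float], frame_len: int) -> List[float]:
    """Return e(f) for each full frame of frame_len samples.

    Assumption: e(f) is the sum of squared samples in the frame.
    A trailing partial frame is dropped.
    """
    n_frames = len(samples) // frame_len
    return [
        sum(s * s for s in samples[f * frame_len:(f + 1) * frame_len])
        for f in range(n_frames)
    ]


def silence_threshold(energies: Sequence[float], alpha: float = 1.8) -> float:
    """Track the minimum frame energy over the file, then scale it by alpha."""
    if not energies:
        raise ValueError("no frames to analyse")
    min_e = energies[0]        # S14/S16: the first frame initialises min
    for e in energies[1:]:
        if e < min_e:          # S15: is this frame quieter than the minimum so far?
            min_e = e          # S17: update min
    return alpha * min_e       # S19: trs = alpha * min


samples = [0.5, 0.5, 3.0, -3.0, 1.0, -1.0]
energies = frame_energies(samples, frame_len=2)
trs = silence_threshold(energies)
```

Because the whole file is available before recognition starts, the loop can scan every frame once before any frame is classified, which is not possible with live microphone input.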

[0045] Since such a threshold setting method makes effective use of the fact that the voice data is already
recorded, and can determine the threshold based on the minimum frame energy of the whole file, it becomes
possible to perform the voiced / silent judgment with few errors.

[0046] In addition, although the minimum over the entire read interval (that is, all the frames which
constitute a voice file) is calculated here, this invention is not limited to this. The minimum need not be
taken over the entire interval, and a section of a certain length is sufficient.
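A variation that takes the minimum over a limited section rather than the whole file might look like the sketch below. The function name `threshold_from_section` is hypothetical, and the choice of the leading frames as the section is an assumption made only for illustration; the source says only that a section of a certain length is enough.

```python
from typing import Sequence


def threshold_from_section(
    energies: Sequence[float], n_frames: int, alpha: float = 1.8
) -> float:
    """Compute the threshold from the minimum energy of the first n_frames frames.

    Assumption: the "section" is the leading part of the file.
    """
    section = energies[:n_frames]
    if not section:
        raise ValueError("section contains no frames")
    return alpha * min(section)


trs_section = threshold_from_section([4.0, 2.0, 0.5, 18.0], n_frames=2)
```

With a short section the threshold can come out higher than the whole-file value (here the quietest frame, 0.5, lies outside the section), so the section should be long enough to contain some silence.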

[0047] Then, drawing 6 is a flow chart which shows the content of the gain computation in Step S4 of
above-mentioned drawing 4.

[0048] When this processing starts, the variable f which shows the counted value of the frame number, the
variable SumAbs which shows the aggregate value of the sample absolute values, and the variable Cnt which
shows the number of additions are each initialized to 0 (Step S21).

[0049] Next, Variable f is incremented (Step S22), and it is judged whether the frame energy e(f)
calculated in drawing 5 mentioned above is larger than the threshold trs (Step S23). When frame energy e(f)
is larger than the threshold trs, the sum of the sample absolute values of the frame is added to the
variable SumAbs (Step S24), and Variable Cnt is incremented (Step S25).

[0050] Moreover, when frame energy e (f) is below a threshold in the above-mentioned step S23, it goes to the 
following step S26 as it is. 

[0051] Next, it is judged whether the end of the file has been reached (Step S26), and if it has not yet
been reached, processing returns to the above-mentioned Step S22 and the processing mentioned above is
repeated.

[0052] Moreover, when it is judged in this Step S26 that the end of the file has been reached, the average
value, average, of the sample absolute value of a frame is calculated by dividing the above-mentioned
variable SumAbs by Variable Cnt (Step S27).

[0053] Then the gain, gain, is calculated by dividing the predetermined value LEV by this average (Step
S28). Here, the predetermined value LEV is set to the assumed average of the sample absolute value. For
example, the average of the sample absolute values of the voiced portions of the training voice data used in
the speech recognition section is used.
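The gain computation of drawing 6 can be sketched as follows. This is a minimal sketch, not the embodiment's code: the function name `compute_gain` is hypothetical, and the frame energy is again assumed to be the sum of squared samples. The sketch follows the text literally, so `average` is the average per-frame sum of absolute values, and `lev` must be expressed in the same units.

```python
from typing import Sequence


def compute_gain(
    samples: Sequence[float], frame_len: int, trs: float, lev: float
) -> float:
    """Average the absolute-value sums of the frames above trs, then return LEV / average."""
    sum_abs = 0.0   # SumAbs (S21)
    cnt = 0         # Cnt (S21)
    n_frames = len(samples) // frame_len
    for f in range(n_frames):                       # S22: next frame
        frame = samples[f * frame_len:(f + 1) * frame_len]
        energy = sum(s * s for s in frame)          # assumed energy measure
        if energy > trs:                            # S23: frame is above the threshold
            sum_abs += sum(abs(s) for s in frame)   # S24
            cnt += 1                                # S25
    if cnt == 0:
        raise ValueError("no frame exceeded the threshold")
    average = sum_abs / cnt                         # S27
    return lev / average                            # S28


samples = [0.5, 0.5, 3.0, -3.0, 1.0, -1.0]
gain = compute_gain(samples, frame_len=2, trs=0.9, lev=8.0)
adjusted = [gain * s for s in samples]              # gain multiplication (Step S5)
```

Frames at or below the threshold are excluded from the average, so long pauses in a recording do not pull the estimated speech level down and inflate the gain.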

[0054] According to such an embodiment, already recorded voice data can be adjusted to a level suitable
for speech recognition, so it becomes possible to perform stable speech recognition that does not depend on
the level of the recorded voice data, and a high-quality dictation system is obtained.
[0055] In addition, this invention is not limited to the embodiment mentioned above, and it is needless to
say that various modifications and applications are possible within limits which do not deviate from the
main point of the invention.

[0056] [Additional remark] According to the above-mentioned embodiment of this invention explained in
detail above, the following configurations can be obtained.

[0057] (1) A speech processing unit characterized by comprising: a read-out means which reads voice data
from a record medium on which the voice data was recorded; a voice judging means which judges the voice data
read by the read-out means to be a voiced portion or a silent portion; an average calculation means which
calculates the average of the absolute value of the voice data judged by the voice judging means to be a
voiced portion; a gain value calculation means which calculates a gain value based on the average; a
multiplication means which multiplies the voice data by the gain value; a speech recognition means which
performs speech recognition on the voice data after multiplication by the gain value; and an output means
which outputs the recognition result of the speech recognition means.

JP, 11-212595, A (1999) [FULL CONTENTS]

[0058] (2) A speech processing unit characterized by comprising: a read-out means which reads the voice
data of a desired file from a record medium on which digitized, frame-divided voice data is recorded in file
units; a voice judging means which judges the voice data read by the read-out means, frame by frame, to be a
voiced frame or a silent frame; an average calculation means which calculates the average of the absolute
value of the voice data in the frames judged by the voice judging means to be voiced frames; a gain value
calculation means which calculates a gain value based on the average; a multiplication means which
multiplies the voice data by the gain value; a speech recognition means which performs speech recognition on
the voice data after multiplication by the gain value; and an output means which outputs the recognition
result of the speech recognition means.
[0059] (3) A record medium on which a voice recognition program for performing speech recognition by
computer is recorded, the voice recognition program being characterized by causing the computer to read
voice data from a record medium on which the voice data was recorded, to judge the read voice data to be a
voiced portion or a silent portion, to adjust the level of the read voice data by a predetermined method
based on the absolute value of the voice data judged to be the voiced portion, to perform speech recognition
on the voice data after the level adjustment, and to output the speech recognition result.

[0060] (4) A record medium on which a processing program for performing, by computer, processing which
passes voice data to a voice recognition program is recorded, the processing program being characterized by
causing the computer to read voice data from a record medium on which the voice data was recorded, to judge
the read voice data to be a voiced portion or a silent portion, to adjust the level of the read voice data
by a predetermined method based on the absolute value of the voice data judged to be the voiced portion, and
to pass the voice data after the level adjustment to the voice recognition program.

[0061] (5) A record medium on which a voice recognition program for performing speech recognition by
computer is recorded, the voice recognition program being characterized by causing the computer to read
voice data from a record medium on which the voice data was recorded, to judge the read voice data to be a
voiced portion or a silent portion, to calculate the average of the absolute value of the voice data judged
to be the voiced portion, to calculate a gain value based on the average, to multiply the voice data by the
gain value, to perform speech recognition on the voice data after multiplication by the gain value, and to
output the speech recognition result.

[0062] (6) A record medium on which a processing program for performing, by computer, processing which
passes voice data to a voice recognition program is recorded, the processing program being characterized by
causing the computer to read voice data from a record medium on which the voice data was recorded, to judge
the read voice data to be a voiced portion or a silent portion, to calculate the average of the absolute
value of the voice data judged to be the voiced portion, to calculate a gain value based on the average, to
multiply the voice data by the gain value, and to pass the voice data after multiplication by the gain value
to the voice recognition program.

[0063] Therefore, according to the invention described in additional remark (1), the gain value is
calculated based on the average of the absolute value of the voiced portion in the voice data and the level
of the voice data is adjusted before speech recognition is performed, so it becomes possible to perform
stable speech recognition that does not depend on the level of the recorded voice data.

[0064] Moreover, according to the invention described in additional remark (2), speech recognition is performed 
after a gain value is calculated from the average of the absolute value of the owner sound frame in the voice 
data and the level of the voice data is adjusted, so stable speech recognition that does not depend on the level 
of the recorded voice data becomes possible. 

[0065] Furthermore, according to the invention described in additional remark (3), the voice recognition program 
causes the computer to adjust the level of the voice data based on the absolute value of the owner sound portion 
in the voice data before performing speech recognition, so stable speech recognition that does not depend on the 
level of the recorded voice data becomes possible. 
[0066] According to the invention described in additional remark (4), the processing program causes the computer 
to adjust the level of the voice data based on the absolute value of the owner sound portion in the voice data 
and then to pass the voice data to a voice recognition program, so stable speech recognition that does not 
depend on the level of the recorded voice data becomes possible. 

[0067] According to the invention described in additional remark (5), the voice recognition program causes the 
computer to calculate a gain value based on the average of the absolute value of the owner sound portion in the 
voice data and to adjust the level of the voice data before performing speech recognition, so stable speech 
recognition that does not depend on the level of the recorded voice data becomes possible. 

[0068] According to the invention described in additional remark (6), the processing program causes the computer 
to calculate a gain value based on the average of the absolute value of the owner sound portion in the voice 
data, to adjust the level of the voice data, and then to pass the voice data to a voice recognition program, so 
stable speech recognition that does not depend on the level of the recorded voice data becomes possible. 
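
The processing summarized in additional remarks (5) and (6) can be sketched as follows. This is a minimal 
illustration under stated assumptions, not the implementation of the specification: the function names 
compute_gain, apply_gain and process, the target level value (which the gain flow chart appears to call LEV), 
the per-sample owner sound flags, and the 16-bit clipping are all hypothetical. 

```python
from typing import Callable, List, Sequence

# Hypothetical target level (appears as LEV in the gain flow chart); the value is an assumption.
TARGET_LEVEL = 2000.0


def compute_gain(samples: Sequence[int], voiced: Sequence[bool],
                 target_level: float = TARGET_LEVEL) -> float:
    """Return target_level divided by the mean absolute value of the voiced samples."""
    sum_abs = 0
    count = 0
    for sample, is_voiced in zip(samples, voiced):
        if is_voiced:  # silent parts do not contribute to the average
            sum_abs += abs(sample)
            count += 1
    if count == 0 or sum_abs == 0:
        # Assumption: leave the level unchanged when no usable voiced sample exists.
        return 1.0
    average = sum_abs / count
    return target_level / average


def apply_gain(samples: Sequence[int], gain: float, limit: int = 32767) -> List[int]:
    """Multiply every sample by gain; clipping to the 16-bit range is an assumption."""
    return [max(-limit - 1, min(limit, round(s * gain))) for s in samples]


def process(samples: Sequence[int], voiced: Sequence[bool],
            recognize: Callable[[List[int]], str]) -> str:
    """Adjust the level, then hand the adjusted data to a recognizer callback."""
    gain = compute_gain(samples, voiced)
    return recognize(apply_gain(samples, gain))
```

Here recognize stands in for the voice recognition program; any callable that accepts the adjusted samples can 
be passed, which mirrors the separation between the processing program and the voice recognition program. 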

[0069] 

[Effect of the Invention] As explained above, according to the speech processing unit of Claim 1 of this 
invention, the level of the voice data is adjusted before speech recognition is performed, so stable speech 
recognition that does not depend on the level of the recorded voice data becomes possible. 
[0070] Moreover, according to the speech processing unit of Claim 2 of this invention, the level of the voice 
data is adjusted based on the absolute value of the owner sound portion in the voice data before speech 
recognition is performed, so stable speech recognition that does not depend on the level of the recorded voice 
data becomes possible. 

[0071] Furthermore, the speech processing unit of Claim 3 of this invention achieves the same effect as the 
invention according to Claim 2 and, because it takes the minimum of the energy of the voice data into 
consideration, can perform a more suitable voice judgment. 
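
One way to picture a judgment criterion derived from the minimum energy is sketched below. It is only an 
illustration: the text states that the criterion is set based on the minimum of the energy of a predetermined 
section, but the frame length, the sum-of-squares energy measure, the margin multiplier, and the floor value 
used here are assumptions, as are the names frame_energies and judge_frames. 

```python
from typing import List, Sequence

FRAME_LEN = 4   # hypothetical frame length in samples
MARGIN = 4.0    # hypothetical multiplier applied to the minimum energy
FLOOR = 1.0     # hypothetical lower bound so digital silence does not give a zero threshold


def frame_energies(samples: Sequence[int], frame_len: int = FRAME_LEN) -> List[int]:
    """Sum of squared samples for each complete frame (assumed energy measure)."""
    return [sum(s * s for s in samples[i:i + frame_len])
            for i in range(0, len(samples) - frame_len + 1, frame_len)]


def judge_frames(samples: Sequence[int], frame_len: int = FRAME_LEN,
                 margin: float = MARGIN, floor: float = FLOOR) -> List[bool]:
    """Return True for frames judged to contain sound, False for silent frames."""
    energies = frame_energies(samples, frame_len)
    if not energies:
        return []
    # The threshold follows the quietest frame, so it adapts to the recording's noise level.
    threshold = max(min(energies), floor) * margin
    return [energy > threshold for energy in energies]
```

Because the threshold tracks the quietest frame in the section, a recording with a higher background noise 
level gets a correspondingly higher threshold, which is one plausible reading of why the minimum is used. 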

[0072] According to the record medium of Claim 4 of this invention, on which the voice recognition program is 
recorded, the voice recognition program causes the computer to adjust the level of the voice data before 
performing speech recognition, so stable speech recognition that does not depend on the level of the recorded 
voice data becomes possible. 

[0073] In addition, according to the record medium of Claim 5 of this invention, on which the processing program 
is recorded, the processing program causes the computer to adjust the level of the voice data and then to pass 
the voice data to a voice recognition program, so stable speech recognition that does not depend on the level 
of the recorded voice data becomes possible. 



[Brief Description of the Drawings] 

[Drawing 1] Conceptual overall configuration diagram of the dictation system of one operation form of this 
invention. 
[Drawing 2] Block diagram showing the electrical configuration of the personal computer of the above-mentioned 
operation form. 

[Drawing 3] Diagram showing the overall flow when voice data is read from a voice memory and speech recognition 
is carried out in the dictation system of the above-mentioned operation form. 

[Drawing 4] Flow chart showing the speech recognition processing in the dictation system of the above-mentioned 
operation form. 

[Drawing 5] Flow chart showing the content of the owner sound / silent judgment processing in above-mentioned 
Drawing 4. 

[Drawing 6] Flow chart showing the content of the gain computation in above-mentioned Drawing 4. 
[Explanations of letters or numerals] 

1 — Digital recorder 

2 — Miniature Card (record medium) 

4 — Personal computer (speech processing unit) 

5 — Display (output means) 
8 — Control program 
9 — Voice recognition program 

11 — Voice memory 

12 — Decoding processing 

13 — Owner sound / silent decision processing 

14 — Sample average-absolute-value computation 

15 — Gain computation 

16 — Gain multiplication processing 

17 — Speech recognition processing 

18 — Display 

31 — CPU (a read-out means, a level adjustment means, a speech recognition means, a voice judging means, a 
minimum computer means, a gain value computer means, a multiplication means, an average computer means) 

32 — Main memory (record medium) 

33 — Internal recording medium (record medium) 
[Drawing 2] 
[Drawing 4] 
[Drawing 5] 
[Drawing 6] 



JP,11-212595,A (1999) [FULL CONTENTS] 

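Flow chart of the gain computation (text recovered from the drawing; step labels S21, S24 and S28 are legible): 
f = 0, SumAbs = 0, Cnt = 0 
SumAbs += Σ |s(i)| for i = 0 to N-1 
average = SumAbs / Cnt 
gain = LEV / average 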


